-
-
Notifications
You must be signed in to change notification settings - Fork 619
fix: correct address validation and memcpy usage in CSphUrl::Connect #3804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code change seems valid
if ( tmp_errno!=0 || !hp || !hp->ai_addr )
however the next change seems unclear
memcpy ( &sin.sin_addr, &( (struct sockaddr_in *)hp->ai_addr )->sin_addr, sizeof(sin.sin_addr) );
and after discussion with the dev team it could be better to rollback that change or create separate commit with that fix change as it is not clear is this change is related to original issue or fixes something different
|
|
||
| #if MYSQL_VERSION_ID>=50515 | ||
| memcpy ( &sin.sin_addr, hp->ai_addr, Min ( sizeof(sin.sin_addr), (size_t)hp->ai_addrlen ) ); | ||
| memcpy ( &sin.sin_addr, &( (struct sockaddr_in *)hp->ai_addr )->sin_addr, sizeof(sin.sin_addr) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameter 1: Destination (&sin.sin_addr)
- Purpose: The destination where we want to store the IP address
- Type:
struct in_addr *(pointer to the IP address field in the socket address structure) - Unchanged: This parameter remains the same in both versions because the destination is correct
Parameter 2: Source (the key difference)
Original (Incorrect) Source: hp->ai_addr
- Type:
struct sockaddr *(generic socket address structure) - Problem: This points to the entire socket address structure, not just the IP address
- Structure Layout:
struct sockaddr { sa_family_t sa_family; // Address family (AF_INET, AF_INET6, etc.) char sa_data[14]; // Address data (contains IP + port + padding) } - Issue: Copying the entire structure would copy the address family and other metadata, not just the IP address
Fixed Source: &( (struct sockaddr_in *)hp->ai_addr )->sin_addr
- Type:
struct in_addr *(pointer specifically to the IP address field) - Breakdown of the expression:
hp->ai_addr- Get the generic socket address pointer(struct sockaddr_in *)hp->ai_addr- Cast it to IPv4-specific socket address structure( ... )->sin_addr- Access the IP address field within that structure&( ... )- Get the address of that IP address field
- Structure Layout:
struct sockaddr_in { sa_family_t sin_family; // Address family (AF_INET) in_port_t sin_port; // Port number struct in_addr sin_addr; // IP address ← This is what we want char sin_zero[8]; // Padding }
Parameter 3: Size (the other key difference)
Original (Incorrect) Size: Min ( sizeof(sin.sin_addr), (size_t)hp->ai_addrlen )
- Logic: Copy the minimum of the destination size or the entire addrinfo structure size
- Problem:
hp->ai_addrlenis the size of the entiresockaddr_instructure (16 bytes), butsin.sin_addris only 4 bytes - Result: Would attempt to copy 4 bytes (the minimum), but from the wrong starting position
Fixed Size: sizeof(sin.sin_addr)
- Logic: Copy exactly the size of an IP address (4 bytes for IPv4)
- Correctness: This matches exactly what we're copying - just the IP address field
- Result: Copies exactly 4 bytes from the correct source location to the correct destination
Why This Matters
Memory Layout Visualization
Original (incorrect):
hp->ai_addr points to: [family|port|IP_ADDR|padding]
^
Copying from here (wrong!)
Fixed (correct):
&(...)->sin_addr points to: [IP_ADDR]
^
Copying from here (correct!)
Impact of the Bug
- Wrong data: The original code would copy the address family and port number instead of the IP address
- Connection failure: The socket would have incorrect IP address data, causing connection attempts to fail
- Hostname resolution appeared broken: Even though
getaddrinfo()successfully resolved the hostname, the wrong data was being used for the connection
Consistency with SphinxSE
This fix makes the UDF's hostname resolution consistent with the working SphinxSE engine implementation in ha_sphinx.cc, which uses the same correct approach:
memcpy ( &sin.sin_addr, &( (struct sockaddr_in *)hp->ai_addr )->sin_addr, sizeof(sin.sin_addr) );Result
After this fix, the sphinx_snippets() UDF can now properly resolve hostnames from /etc/hosts and DNS, making it consistent with the SphinxSE engine's behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tomatolog does this explanation make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@klirichek could you check that the last fix is really good now?
The explanation said it is a right fix for the original issue.
6a20e26 to
32ffca1
Compare
The SphinxSE sphinx_snippets had a erronous test condition on the getaddrinfo return value. With this inverted there was always an error. The entire CSphUrl::Connect also failed to handle IPv6 addresses. As getaddrinfo can return multiple addresses it makes sense to try each of these. Performing freeaddrinfo(hp) on an error condition in getaddrinfo caused a SEGV. Removed some boilerplate #ifdef around very old MySQL versions and unused defines. Thanks to bug report by Misagh Laghaei who kindly refered me to manticoresoftware/manticoresearch#3804 by Sergey Nikolaev <[email protected]> that showed showed the getaddrinfo handling errors, and other odd handling of connection code.
|
Note per above referenced MariaDB/server#4407, I went a little further. Small fix, the following will SEGV on a failed resolution so should be removed. I also testing applying this patch to the Mariadb code, applied cleanly as there's been pretty much no changes on the MariaDB side either. I removed the above SEGV causing fragment, and it passed the unit tests I added into MariaDB. If you'd like to take my MariaDB patch for this you are welcome to do so freely. |
Thanks! Done. |
The SphinxSE sphinx_snippets had a erronous test condition on the getaddrinfo return value. With this inverted there was always an error. The entire CSphUrl::Connect also failed to handle IPv6 addresses. As getaddrinfo can return multiple addresses it makes sense to try each of these. Performing freeaddrinfo(hp) on an error condition in getaddrinfo caused a SEGV. Removed some boilerplate #ifdef around very old MySQL versions and unused defines. This sphinx tests reuse the parsing of configuration files from the test My::Config packaging, which needed to change as sphinx supports multiple listen directives needed for the test. Without this the last specified listen only was recorded in the configuration file. charset_type removed the indexer reported 2.2.11: WARNING: key 'charset_type' was permanently removed from Sphinx configuration. Thanks to bug report by Misagh Laghaei who kindly refered me to manticoresoftware/manticoresearch#3804 by Sergey Nikolaev <[email protected]> that showed showed the getaddrinfo handling errors, and other odd handling of connection code.
Related issue #3799